We discuss the task of reconstructing the topological map of an environment based
on the sequences of locations visited by a mobile agent. This task occurs in systems
neuroscience, where one seeks to reconstruct the global topological map of the
environment from the activation patterns of the place coding cells in the
hippocampus area of the brain. A similar task appears in the context of establishing
wifi connectivity maps.
I. INTRODUCTION


This paper¹ considers how to infer a topological representation of an environment from
a trace of place information. We assume that place information takes the form of a
finite set of place names, of unknown location, and a time series specifying when a mobile
agent is at these locations. The places may overlap spatially (and thus more than one
place name may be simultaneously "active"), but are assumed to be unique (not spatially
identical).

One example of such data (which was our original motivation) is the firing traces of
place cells (PCs) of a rat hippocampus after it has learned a particular environment.
After a period of exposure to a particular spatial environment, cells in the rat hippocampus
are reliably associated with particular regions of space, place fields (PFs), such that the PCs
"fire" (i.e. have a firing frequency above a certain threshold) when the rat is in the PF, and
only then². Can the spatial layout of the environment be recovered purely by inspection of
the firing activity of the PCs? Recovering a topological map from PC data in this way has
been called the "space reconstructing thought experiment" (SRE) [2]. This paper builds on
this unpublished report [2], where the relevant neuroscience literature is surveyed in detail.
[2] also suggests the use of homology theory and mereotopology to analyse the PC data.
Here we explore the second suggestion. This problem may occur in other domains; one
example is the set of visible SSIDs³ of wifi base stations [3, 4].

1 Thanks are due to our colleagues and the anonymous referees for their useful comments.

FIG. 1: The RCC5 relations, from the left: DC, EC, PO, PP, PPi, EQ.

We take the position here that places are regions rather than points, not only because
it may be hard to determine when the agent is at the central "focus" of a place, but also
because this naturally fits with the two domains mentioned above. This suggests
the use of a region based representation such as RCC [5]. In both these domains it is
unlikely that regions will ever be tangentially connected (the EC or TPP relations of the RCC
mereotopological calculus, or similar relations in other calculi [5]). Thus we will use a purely
mereological calculus, RCC-5 [?], illustrated in fig. 1. Connectivity between places implies
that two regions at least partially overlap (PO). The RCC relations can all be defined in
terms of the connection primitive C(x, y) [5], "x is connected to y", as can a predicate Con(x),
which is true when x is a simply connected region (i.e. is one-piece), and a predicate P(x, y)
which is true when x is a part of y (i.e. either EQ(x,y) or PP(x,y) holds). The underlying
question we are trying to answer here is whether, and how, a topological map can be
extracted from the sparse time series data indicated above. In particular we deliberately
assume that there is no information as to the agent's actual metrical movements, orientation,
odometry or heading. This might be because such information is hard to obtain or compute,
but it also seems to be an interesting challenge in its own right.



2 Predictive firings have been reported, but we ignore that complexity here. 

3 In the case of multiple base stations with the same SSID, the mac address can be used to distinguish 
them. 
The idea of computing a topological representation of the environment of a mobile agent
is not new, and has been the subject of active research in robotics. The work on the
"spatial semantic hierarchy" is perhaps the most relevant here. This framework provides a
multilevel approach to computing a topological representation, taking account of metrical,
geometrical information and potentially a wide variety of sensor readings. Topological
information is thus inferred after several layers of processing, and "describes the environment as
a collection of places, paths and regions, linked by topological relations, such as connectivity,
order, boundary and containment". Places are always points, paths are one dimensional,
and regions two dimensional, e.g. defined by a closed loop of paths, or by abstraction of
a group of places. The process of inferring the topology of these entities is by abduction
using circumscription: the underlying idea is to abduce a minimal topological description
which explains the underlying sensor data. Although relevant to the considerations in this
paper, the spatial semantic hierarchy framework differs in several respects; in particular it
assumes, and makes use of, non topological information and it allows for the possibility of
aliasing (i.e. that place names are not necessarily unique). Finally, we mention the Ratslam
approach to simultaneous location and mapping (SLAM) inspired by the neuroscience work
on the rat hippocampus [11]. This uses vision, odometry and neural networks to produce
a topological map based on PCs and is contrasted with the more usual approach to SLAM
using a probabilistic approach based on particles. Ratslam uses both head direction and
PCs (i.e. both directional and positional information) to produce its maps. Here we only
use PCs in order to concentrate on the purely topological aspects of the problem.



II. COMPUTING CONNECTIVITY 



We assume that we have a finite set of k places, $p_i$, distinguished by a predicate $\mathrm{Place}(p_i)$,
whose intended interpretations are regions of 2D space. We also assume a finite set of n
times, with a primitive ordering relation $t_1 < t_2$ which specifies when $t_1$ is temporally before
$t_2$, and a predicate $\mathrm{At}(t, p)$ which specifies whether the agent is at place p at time t. For the
present, we also assume that time is sufficiently fine grained that no place transitions
are missed, i.e. not recorded. We will return to this assumption below in §III. If the agent
is at two places simultaneously then we can infer that they are connected:

$$\forall x\, \forall y\, [\exists t\, [\mathrm{At}(t, x) \land \mathrm{At}(t, y)] \rightarrow \mathrm{C}(x, y)]$$
Note that for the converse to hold, we would have to assume that the agent had made a
complete exploration of the environment, in the sense that every pair of physically connected
places had actually been visited at consecutive times. For the case of rats running
experimental mazes in laboratories, this is certainly the case after a relatively short period
of time. For large scale geographic environments this may not hold, though it may be a
reasonable "closed world" assumption to make until contrary evidence comes in, requiring
a non monotonic revision of the topological map. Of course map revisions may have to be
made in any case as a result of structural changes in the environment or the number or
liveliness of the place indicators.

The set of pairwise place connections can clearly be computed in at most $k^2 n$ time,
since it simply requires that each time point is scanned in turn, checking all pairs of places
for whether they are simultaneously active.
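
The scan just described can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the trail format (a mapping from each time point to the set of place names active at that time) and the function name `connections` are ours and are not part of the paper.

```python
from itertools import combinations

def connections(trail):
    """Return the unordered place pairs that are ever active at the same time.

    `trail` maps each time point to the set of places p for which At(t, p)
    holds (an assumed input format). Each time point is scanned once and all
    pairs of active places are recorded, so the cost is at most k^2 * n.
    """
    connected = set()
    for active in trail.values():
        # Sorting gives each pair a canonical (alphabetical) order.
        for x, y in combinations(sorted(active), 2):
            connected.add((x, y))
    return connected

# Hypothetical trail over four places a, b, c, d.
trail = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}, 4: {"b", "c", "d"}}
pairs = connections(trail)
```

Each returned pair (x, y) corresponds to one inferred fact C(x, y); pairs that never co-occur are simply absent, which matches the "closed world" reading discussed above.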

In some domains, determining whether the agent is currently at a particular place may 
be more problematic. E.g. in the case of hippocampal PFs, a certain threshold of firing 
frequency is required, or in the wifi domain a minimum signal level might be required. Such 
additional constraints would need to be factored into the computation of At. This might 
be a global threshold, or it may be a local (spatial) threshold to the particular place, or a 
temporal threshold (i.e. different thresholds might be applicable at different times, perhaps 
due to changing environmental conditions). We will not consider this further in this paper, 
though we note that calculi for reasoning about regions with indeterminate boundaries, such 
as the "egg-yolk" calculus may be relevant. 
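
To make the role of such thresholds concrete, here is a minimal Python sketch of deriving At from raw signal levels. The data layout (`rates` as a mapping from time to per-place levels), the function name `at_from_levels` and all numeric values are hypothetical; only a global threshold with optional per-place overrides is shown, not a temporal one.

```python
def at_from_levels(rates, global_threshold, local_thresholds=None):
    """Turn raw signal levels into a place trail.

    `rates` maps each time point to a dict {place: level}, where level is
    e.g. a firing frequency or a wifi signal strength (assumed format).
    A place counts as active when its level reaches its own local threshold
    if one is given, and the global threshold otherwise.
    """
    local_thresholds = local_thresholds or {}
    trail = {}
    for t, levels in rates.items():
        trail[t] = {
            place
            for place, level in levels.items()
            if level >= local_thresholds.get(place, global_threshold)
        }
    return trail

# Hypothetical readings for two places at two time points.
rates = {0: {"a": 12.0, "b": 1.5}, 1: {"a": 6.0, "b": 7.0}}
strict = at_from_levels(rates, global_threshold=5.0, local_thresholds={"b": 8.0})
lenient = at_from_levels(rates, global_threshold=5.0)
```

With the stricter local threshold for b, places a and b are never simultaneously active and no connection would be inferred; with the global threshold alone they are. The choice of threshold therefore directly changes the resulting map.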

From the predicate C, it is straightforward to build a connectivity graph, specifying which 
regions/places are connected through the overlap relation, and thus the connection relation 
C(x, y) holds between them. From the point of view of having a representation from which 
it is possible e.g. to do path planning, this is all that is needed. But it would be useful 
to be able to determine the overall "topological shape" of the environment as exposed by 
the time series data. Below we show how this might be done; we are not proposing the 
particular definitions and concepts below as final, but rather as illustrative of the kind of 
analysis possible. 
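
As an illustration of the path planning use mentioned above, the following Python sketch runs a breadth-first search over a connectivity graph built from C(x, y) facts. The function name `shortest_place_path` and the example edge list are assumptions made for illustration; any standard graph search would serve equally well.

```python
from collections import deque

def shortest_place_path(edges, start, goal):
    """Return a shortest sequence of overlapping places from start to goal.

    `edges` is an iterable of pairs (x, y) for which C(x, y) holds.
    Returns None when goal cannot be reached from start.
    """
    graph = {}
    for x, y in edges:
        graph.setdefault(x, set()).add(y)
        graph.setdefault(y, set()).add(x)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical connection structure: C(a,b), C(b,c), C(b,d), C(d,e).
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
route = shortest_place_path(edges, "a", "e")
```

The route is purely topological: it lists which places to pass through, with no metric information about distances or headings.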

We first consider environments which are essentially linear, such as the typical mazes run
by rodents. These might be simple linear runs, or have junctions (e.g. in the shape of a
T, W, H), or be more complex mazes with loops (e.g. 0, 8, 9).

FIG. 2: Three 1D environments formed by overlapping places; the darker areas show overlap
between regions. To the right of each shape is the connection structure constructed by the overlaps
between places, which can be distinguished by virtue of their connectivity structure.

In a strictly topological sense, the first set of examples above are all topologically
identical: they can all be shrunk to a point. However, by subdividing the shapes into parts
(corresponding to the simple linear stretches), and then specifying the connectivity between
these parts, these shapes can all be distinguished; see fig. 2⁴, where three shapes and their
connection structure are depicted. Shapes (i) and (ii) are topologically identical in the sense
that they can both be shrunk to a point, but the connection structure of their places is
different. The connection structure can also of course be given purely symbolically; e.g. (i)
is: C(a,b), C(b,c), C(b,d), C(d,e), and DC(α, β) for all other pairs of regions α, β.
Visualizing this purely symbolic structure is an interesting problem which we return to below.
In order to formally analyse such linear structures we can make the following definitions,
which group together collections of regions to form higher level abstractions⁵.
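$$
\begin{aligned}
\mathrm{LinearSegment}(x) \equiv_{\mathrm{def}}{} & \mathrm{Con}(x) \land \mathrm{SumOfPlaces}(x) \land {} \\
& \forall y\, [[\mathrm{Place}(y) \land \mathrm{P}(y, x)] \rightarrow [\mathrm{End}(y) \lor \mathrm{Middle}(y) \lor \mathrm{Junction}(y)]] \land {} \\
& \exists!^{2} y\, [\mathrm{Place}(y) \land \mathrm{P}(y, x) \land [\mathrm{End}(y) \lor \mathrm{Junction}(y)]] \\
\mathrm{End}(x) \equiv_{\mathrm{def}}{} & \mathrm{Place}(x) \land \exists!^{1} y\, [\mathrm{Place}(y) \land \mathrm{C}(x, y) \land \lnot \mathrm{EQ}(y, x)] \\
\mathrm{Middle}(x) \equiv_{\mathrm{def}}{} & \mathrm{Place}(x) \land \exists!^{2} y\, [\mathrm{Place}(y) \land \mathrm{C}(x, y) \land \lnot \mathrm{EQ}(y, x)] \\
\mathrm{Junction}(x) \equiv_{\mathrm{def}}{} & \mathrm{Place}(x) \land \exists^{3} y\, [\mathrm{Place}(y) \land \mathrm{C}(x, y) \land \lnot \mathrm{EQ}(y, x)] \\
\mathrm{SumOfPlaces}(x) \equiv_{\mathrm{def}}{} & \lnot \exists y\, [\mathrm{P}(y, x) \land \forall z\, [[\mathrm{Place}(z) \land \mathrm{P}(z, x)] \rightarrow \mathrm{DC}(y, z)]]
\end{aligned}
$$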

4 Note that this figure, and other similar ones later in the paper are intended to be purely illustrative, 
rather than realistic configurations of actual PFs recorded from real rats, or from wifi recordings. 

5 $\exists!^{n} \alpha\, \phi(\alpha)$ is syntactic sugar for "there exist exactly n α s.t. φ(α)". $\exists^{n} \alpha\, \phi(\alpha)$ is syntactic sugar for
"there exist at least n α s.t. φ(α)".
FIG. 3: A linear track composed of (i) doubly overlapping places, & (ii) singly overlapping places;
beneath are the connection structures formed by the overlaps. (iii): An environment with two
maximal induced paths: abdfg & acdfg.

Thus Ends are places only connected to one other place, Middles only to two other places, 
and Junctions are connected to three or more other places; LinearSegments are composed 
of two places being either Ends or Junctions and all other places in the LinearSegment are 
Middles. The predicate SumOfPlaces(x) ensures that x is a region every part of which is part 
of some place, so that there are no "extra bits" of space which are part of x but not part of 
some place. 
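
At the level of the connection graph, the End, Middle and Junction predicates reduce to counting the distinct other places each place is connected to. The Python sketch below illustrates this reading; the function name `classify_places` and the edge-list input are our own assumptions, and places with no connections at all are simply left unclassified.

```python
def classify_places(edges):
    """Label each place as End, Middle or Junction from its connections.

    `edges` is an iterable of pairs (x, y) for which C(x, y) holds.
    A place connected to exactly one other place is an End, to exactly two
    a Middle, and to three or more a Junction.
    """
    neighbours = {}
    for x, y in edges:
        if x != y:  # a place's connection to itself (EQ) does not count
            neighbours.setdefault(x, set()).add(y)
            neighbours.setdefault(y, set()).add(x)

    kinds = {}
    for place, others in neighbours.items():
        if len(others) == 1:
            kinds[place] = "End"
        elif len(others) == 2:
            kinds[place] = "Middle"
        else:
            kinds[place] = "Junction"
    return kinds

# Connection structure C(a,b), C(b,c), C(b,d), C(d,e).
kinds = classify_places([("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")])
```

In this example b is the only Junction, d is a Middle, and a, c and e are Ends, so the structure contains the linear segments a+b, b+c and b+d+e.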

These definitions achieve the desired effect in linear environments provided that no places
are long enough to overlap more than one place in each direction (i.e. as in fig. 3(ii)). E.g.,
consider fig. 3(i): the layout is still clearly in some sense linear, but as the connection graph
shows, it is not so in any very straightforward way.

One approach to this might be to use the idea of an induced path, i.e. a sequence of
nodes in a graph such that each is connected to two neighbours in the path, except for
the two end nodes, and with the proviso that there are no "short cuts" (i.e. direct links)
between any two nodes in the path using other edges in the graph (which would be indicated
by the presence of a 3-clique)⁶. Thus in the graph in fig. 3(i), abde, ace, bce, acd are all
(maximal) induced paths. However this notion is not entirely satisfactory since it does not
define a unique path for the environment (since there are two induced paths from a to e). In
this case we can note that there is a unique longest induced path (abde); of the others, ace
has the same start and end nodes, and the other two each have an end in common with the
longest. None are disjoint.

6 In the definition below, we also rule out trivial induced paths of length one (i.e. with just two nodes and
one edge).

FIG. 4: A configuration in which, by considering place trails from more than two places simultaneously,
a refined map can be obtained.

We also note that all the nodes in the non longest induced paths
are within one edge of a node on the longest induced path (i.e. c is directly linked to a, b, d
and e). It is not guaranteed that there is a unique longest induced path, either because
there are two entirely disjoint such paths, or because there are alternatives with common
nodes (e.g. see fig. 3(iii)). We thus propose the following definition for what we will call
quasi linear segments.
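$$
\begin{aligned}
\mathrm{QuasiLin}(x) \equiv_{\mathrm{def}}{} & \exists w\, [\mathrm{P}(w, x) \land \mathrm{InducedPath}(w) \land \mathrm{SumOfPlaces}(x) \land {} \\
& \forall y\, [[\mathrm{P}(y, x - w) \land \mathrm{Place}(y)] \rightarrow \mathrm{C}(y, w)]] \\
\mathrm{InducedPath}(w) \equiv_{\mathrm{def}}{} & \mathrm{SumOfPlaces}(w) \land \mathrm{Con}(w) \land \lnot \exists v\, [\mathrm{3Clique}(v) \land \mathrm{P}(v, w)] \land {} \\
& \exists (s, x, y, z)\, [\mathrm{Place}(x) \land \mathrm{Place}(y) \land \lnot \mathrm{EQ}(x, y) \land \mathrm{Place}(s) \land \mathrm{P}(s, z) \land {} \\
& \mathrm{EQ}(w, x + y + z) \land \forall u\, [[\mathrm{Place}(u) \land \mathrm{P}(u, z)] \rightarrow \lnot \mathrm{Con}(w - u)]] \\
\mathrm{3Clique}(w) \equiv_{\mathrm{def}}{} & \exists (x, y, z)\, [\mathrm{Place}(x) \land \mathrm{Place}(y) \land \mathrm{Place}(z) \land \mathrm{EQ}(w, x + y + z) \land \lnot \mathrm{EQ}(x, y) \land {} \\
& \lnot \mathrm{EQ}(x, z) \land \lnot \mathrm{EQ}(y, z) \land \mathrm{C}(x, y) \land \mathrm{C}(x, z) \land \mathrm{C}(z, y)]
\end{aligned}
$$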

Returning to fig. 3(i), QuasiLin(a+b+c+d+e) is true since it has at least one induced
path in it, and all the other places in it are directly connected to a place in that induced
path. Exactly what definition of "linearity" might be appropriate in a particular domain
will depend on what is required or suitable for the application. E.g., QuasiLin(x) will be
true if x is the union of a simple path of length m and a clique of size n s.t. exactly two
nodes are in common between the clique and the simple path. It would be straightforward
to eliminate this case, or to allow only cliques of up to a size m to occur within a quasilinear
path.
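
A graph-level reading of this definition can be checked by brute force, as in the Python sketch below. It is an illustration under stated assumptions rather than an implementation of the logical definition itself: places are graph nodes, an induced path must have at least three nodes, and the edge list `track` is our reading of the doubly overlapping track of fig. 3(i), in which each place overlaps its next two neighbours. The names `build_graph`, `induced_paths` and `is_quasi_linear` are hypothetical.

```python
def build_graph(edges):
    """Build an undirected adjacency map from pairs (x, y) with C(x, y)."""
    graph = {}
    for x, y in edges:
        graph.setdefault(x, set()).add(y)
        graph.setdefault(y, set()).add(x)
    return graph

def induced_paths(graph):
    """Yield every induced path with at least three nodes (in both directions)."""
    def extend(path):
        if len(path) >= 3:
            yield tuple(path)
        for nxt in sorted(graph[path[-1]]):
            # nxt may touch only the current end of the path: no short cuts.
            if nxt not in path and all(nxt not in graph[p] for p in path[:-1]):
                yield from extend(path + [nxt])
    for start in sorted(graph):
        yield from extend([start])

def is_quasi_linear(graph):
    """True if some induced path has every other place directly connected to it."""
    places = set(graph)
    for path in induced_paths(graph):
        on_path = set(path)
        if all(graph[p] & on_path for p in places - on_path):
            return True
    return False

# Assumed edges of the doubly overlapping track.
track = build_graph([("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"),
                     ("c", "d"), ("c", "e"), ("d", "e")])
# A hypothetical three-armed layout, which should not count as quasi linear.
spider = build_graph([("j", "a1"), ("a1", "a2"), ("j", "b1"),
                      ("b1", "b2"), ("j", "c1"), ("c1", "c2")])
```

Here `is_quasi_linear(track)` holds because of the induced path a, b, d, e with c attached to it, whereas the three-armed layout fails because any induced path misses the tip of at least one arm. The enumeration is exponential in the worst case, which is acceptable only for small graphs.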

Another important grouping of places are "open spaces", which are likely to be identified
by clusters of many places, though not necessarily in the form of a clique (though they
may well contain cliques). Having identified cliques and linear segments, a natural step
would be to replace these by "super nodes", and then continue analysing the environment
at this more abstract level. Indeed, there are already existing approaches to analysing and
drawing graphs which take this approach, e.g. [?]. E.g., "open spaces" may frequently be
represented as cliques of cliques.

One other aspect of connectivity analysis not so far mentioned explicitly is determining 
whether there are cycles in the environment (caused by circular structures in a linear
environment, or by obstacles in an open environment). A graph theoretic approach to this
would be to look for chordless cycles, i.e. circular induced paths of length at least four. As 
a trivial example of this, consider fig. E](ii); in practice this approach is likely to require 
refinement to properly capture the required notion. 
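
The search for chordless cycles can be sketched as follows in Python. This is a brute-force illustration suitable only for small graphs; the function names and both example graphs (a hypothetical ring of four places with a tail, and a doubly overlapping linear track) are assumptions made for illustration.

```python
def build_graph(edges):
    """Build an undirected adjacency map from pairs (x, y) with C(x, y)."""
    graph = {}
    for x, y in edges:
        graph.setdefault(x, set()).add(y)
        graph.setdefault(y, set()).add(x)
    return graph

def chordless_cycles(graph, min_length=4):
    """Return the node sets of all chordless cycles with at least min_length nodes."""
    cycles = set()

    def extend(path):
        for nxt in graph[path[-1]]:
            # Skip revisits, and nodes linked to the interior of the path (chords).
            if nxt in path or any(nxt in graph[p] for p in path[1:-1]):
                continue
            closes = len(path) >= 2 and nxt in graph[path[0]]
            if closes:
                # nxt joins both ends of an induced path: a chordless cycle.
                if len(path) + 1 >= min_length:
                    cycles.add(frozenset(path + [nxt]))
            else:
                extend(path + [nxt])

    for start in graph:
        extend([start])
    return cycles

# Hypothetical looped environment: ring a, b, d, c with a tail d, f, g.
ring = build_graph([("a", "b"), ("b", "d"), ("d", "c"), ("c", "a"),
                    ("d", "f"), ("f", "g")])
# Doubly overlapping linear track: only triangles, so no cycle is reported.
track = build_graph([("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"),
                     ("c", "d"), ("c", "e"), ("d", "e")])
```

The ring yields the single cycle through a, b, c and d, while the doubly overlapping track yields none, since each of its cycles has a chord. This shows why chordlessness matters: overlapping neighbours alone create many short cycles that do not correspond to loops in the environment.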



III. VARIANTS OF THE MAPPING TASK 



In this section we consider a number of variants of the basic mapping task and how they 
might be achieved. 

Computing Connectivity Mereologically: The approach outlined above shows how we
can compute connection information from knowledge that the agent is simultaneously At
two places. This gives rise to binary connectedness information, and thence (in RCC-5)
the knowledge that particular pairs of places partially overlap (PO). It is not possible,
though, to infer from the connection structure that any triple (quadruple, ...) of places
overlap, even though this information might in fact be readily apparent in the place trail
(indicated by an agent being simultaneously at three (four, ...) places). We could therefore
compute a more fine grained representation, in which we explicitly represent those
intersections of places which are known to exist, and similarly those relative complements
of places known to exist. E.g. consider the place trail: At(t1,a), At(t2,a), At(t2,b),
At(t3,a), At(t3,b), At(t3,c), At(t4,b), At(t4,c), At(t5,b), At(t5,c), At(t5,d), At(t6,d).
Using pairwise connections, as in §II, a connection structure as in fig. 4(i) would be
computed. Not knowing which subregions can actually exist, it would be reasonable to
produce a map such as in fig. 4(ii). However, by inspecting the above place trail, we can
infer that the regions a-b-c-d, a+b-c-d, a+b+c-d, b+c-a-d, b+c+d-a, and d-a-b-c all exist
(where, e.g., a+b-c-d denotes the part common to a and b which lies outside c and d), but
there is no evidence to support the existence of any of the other five Boolean combinations
of the four places a, b, c, d which exist in fig. 4(ii). Thus we can build a simplified map as
in fig. 4(iii), in which these regions do not exist (indicated by shading).
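
The bookkeeping in this example can be reproduced with a short Python sketch. It rests on one assumption of ours: that the regions a map built from pairwise connections alone might contain are exactly the sets of pairwise connected places. Each region is represented by the set of places the agent is inside; the function names are hypothetical.

```python
from itertools import combinations

def observed_regions(trail):
    """Regions witnessed by the trail: one per distinct set of co-active places."""
    return {frozenset(active) for active in trail.values() if active}

def candidate_regions(trail):
    """Regions possible given only pairwise connections (assumed: all cliques)."""
    connected = {frozenset(pair)
                 for active in trail.values()
                 for pair in combinations(active, 2)}
    places = sorted(set().union(*trail.values()))
    regions = set()
    for size in range(1, len(places) + 1):
        for group in combinations(places, size):
            if all(frozenset(pair) in connected for pair in combinations(group, 2)):
                regions.add(frozenset(group))
    return regions

# The place trail from the text, as {time: set of active places}.
trail = {1: {"a"}, 2: {"a", "b"}, 3: {"a", "b", "c"},
         4: {"b", "c"}, 5: {"b", "c", "d"}, 6: {"d"}}
supported = observed_regions(trail)
unsupported = candidate_regions(trail) - supported
```

Under this reading there are eleven candidate regions, of which six are supported by the trail. The five unsupported ones are "b only", "c only", "a and c", "b and d" and "c and d", which agrees with the count of five given in the text.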

Partial information: It is believed that PCs form a cover for the environment the rat has 
explored, i.e. wherever the rat is, at least one PC will be firing. However for the SRE, an 
external observer is receiving signals from a set of electrodes, and only a subset of the PCs 
will actually be recorded and thus only partial information about the set of places active at 
any time. At some times, there may be no active PCs being recorded, and thus there will 
be "temporal gaps" . The question is, what can we say about the nature of the environment 
given only such partial information? In the wifi domain, presumably full information would 
always be available; there might still be temporal gaps, because no base station is in range, 
but that is different to not being able to detect a base station which is in range. One might 
want to regard the union of all locations where there is no base station in range as a (k+1)st
place. The discussion below concerns domains where there is only partial information. First
we define the notion of a temporal gap: 
$$\mathrm{Gap}(t_1, t_2) \equiv_{\mathrm{def}} \exists (x_1, x_2)\, [\mathrm{At}(t_1, x_1) \land \mathrm{At}(t_2, x_2) \land \forall (t_3, x_3)\, [[t_1 < t_3 \land t_3 < t_2] \rightarrow \lnot \mathrm{At}(t_3, x_3)]]$$

We can now write a rule which allows us to infer that the agent must be at at least one
place during a gap, and that these places form a connected region which is itself connected
to all the places where the agent is at $t_1$ and $t_2$.

$$\forall (t_1, t_2, x_1, x_2)\, [[\mathrm{Gap}(t_1, t_2) \land \mathrm{At}(t_1, x_1) \land \mathrm{At}(t_2, x_2)] \rightarrow \exists x_3\, [\mathrm{SumOfPlaces}(x_3) \land \mathrm{C}(x_1, x_3) \land \mathrm{C}(x_2, x_3) \land \mathrm{Con}(x_3)]]$$

There might be more than one path $x_3$ linking the places at $t_1$ and $t_2$; however, we cannot
infer this without evidence (such as metric information about speed and distance travelled,
but we do not consider such possibilities here).
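
Detecting temporal gaps in a recorded trail is straightforward, as the Python sketch below suggests. Two points are assumptions of ours: the trail format ({time: set of active places}, with an empty set when nothing is recorded), and the extra requirement that at least one silent time point lies strictly between the two ends, so that adjacent active time points are not reported as gaps. The function names are hypothetical.

```python
def find_gaps(trail):
    """Return pairs (t1, t2) of active times separated only by silent times."""
    gaps = []
    last_active = None
    silent_since_last = False
    for t in sorted(trail):
        if trail[t]:
            if last_active is not None and silent_since_last:
                gaps.append((last_active, t))
            last_active = t
            silent_since_last = False
        else:
            silent_since_last = True
    return gaps

def gap_constraints(trail):
    """For each gap, list the places that the unseen region x3 must connect to.

    Each result is (places active at t1, places active at t2); the rule in the
    text says some connected sum of unrecorded places is connected to all of them.
    """
    return [(sorted(trail[t1]), sorted(trail[t2])) for t1, t2 in find_gaps(trail)]

# Hypothetical partial recording: nothing is detected at times 3 and 4.
trail = {1: {"a"}, 2: {"a", "b"}, 3: set(), 4: set(), 5: {"d"}, 6: {"d"}}
gaps = find_gaps(trail)
constraints = gap_constraints(trail)
```

For this trail a single gap is found between times 2 and 5, and the inferred unrecorded region must be connected to a and b on one side and to d on the other. Nothing is said about how many unrecorded places make up that region.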

There is (at least) a second way in which the underlying place information might be
partial. We made the explicit assumption (in §II) that time is sufficiently fine grained that
no place transitions are missed. If this is not the case (because the speed of the agent is fast
with respect to the recording granularity), then gaps may occur even if there are no missing
place sensors. In this case we would need a modified version of the rule above, since it may
be that no new places need to be inferred to fill in the gap, but rather that at least one of
the places at time $t_1$ directly connects with at least one of the places at time $t_2$. Some form
of non monotonic reasoning is likely to be needed in general for reasoning in the presence of
such kinds of partial knowledge, in order to perform a domain closure, or to minimize the
number of places assumed to exist (as in the spatial semantic hierarchy discussed in §I).

IV. FINAL COMMENTS 

We have discussed the problem of computing topological maps from knowledge of place
trails, a task applicable in at least two identified domains. There are many ways in which
this work could be extended. E.g. we could consider how to turn the symbolic topological
representations into a graphic visualisation⁷ automatically. Of course any such depiction will
inevitably have metric qualities, but these must be ignored when interpreting the
visualization⁸. Any qualitative spatial description will always have many metric realizations. One
approach is to use diagrammatic reasoning techniques (e.g. [10]) for visualising Euler diagrams.

We have already conducted some experimental work with artificial data and are currently 
collecting real data and will then evaluate the ideas sketched here, and refine them as 
appropriate. We may also consider other variants of the problem, e.g. scenarios with multiple 
agents (where the At predicate has a third argument indicating the agent). 